Search | WHO COVID-19 Research Database

1.

PREFMoDeL: A Systematic Review and Proposed Taxonomy of Biomolecular Features for Deep Learning

North, Jacob L.; Hsu, Victor L.

Applied Sciences ; 13(7):4356, 2023.

Article in English | ProQuest Central | ID: covidwho-2301015

ABSTRACT

Of fundamental importance in biochemical and biomedical research is understanding a molecule's biological properties—its structure, its function(s), and its activity(ies). To this end, computational methods in Artificial Intelligence, in particular Deep Learning (DL), have been applied to further biomolecular understanding—from analysis and prediction of protein–protein and protein–ligand interactions to drug discovery and design. While choosing the most appropriate DL architecture is vitally important to accurately model the task at hand, equally important is choosing the features used as input to represent molecular properties in these DL models. Through hypothesis testing, bioinformaticians have created thousands of engineered features for biomolecules such as proteins and their ligands. Herein we present an organizational taxonomy for biomolecular features extracted from 808 articles from across the scientific literature. This objective view of biomolecular features can reduce various forms of experimental and/or investigator bias and additionally facilitate feature selection in biomolecular analysis and design tasks. The resulting dataset contains 1360 nondeduplicated features, and a sample of these features were classified by their properties, clustered, and used to suggest new features. The complete feature dataset (the Public Repository of Engineered Features for Molecular Deep Learning, PREFMoDeL) is released for collaborative sourcing on the web.

2.

Distinguishing features of fold-switching proteins.

Chakravarty, Devlina; Schafer, Joseph W; Porter, Lauren L.

Protein Sci ; 32(3): e4596, 2023 03.

Article in English | MEDLINE | ID: covidwho-2239627

ABSTRACT

Though many folded proteins assume one stable structure that performs one function, a small-but-increasing number remodel their secondary and tertiary structures and change their functions in response to cellular stimuli. These fold-switching proteins regulate biological processes and are associated with autoimmune dysfunction, severe acute respiratory syndrome coronavirus-2 infection, and more. Despite their biological importance, it is difficult to computationally predict fold switching. With the aim of advancing computational prediction and experimental characterization of fold switchers, this review discusses several features that distinguish fold-switching proteins from their single-fold and intrinsically disordered counterparts. First, the isolated structures of fold switchers are less stable and more heterogeneous than single folders but more stable and less heterogeneous than intrinsically disordered proteins (IDPs). Second, the sequences of single fold, fold switching, and intrinsically disordered proteins can evolve at distinct rates. Third, proteins from these three classes are best predicted using different computational techniques. Finally, late-breaking results suggest that single folders, fold switchers, and IDPs have distinct patterns of residue-residue coevolution. The review closes by discussing high-throughput and medium-throughput experimental approaches that might be used to identify new fold-switching proteins.

Subject(s)

COVID-19 , Intrinsically Disordered Proteins , Humans , Intrinsically Disordered Proteins/chemistry , Protein Folding , Models, Molecular

3.

Computation-aided novel epitope prediction by targeting spike protein's functional dynamics in Omicron

Sun, Bin; Zhang, Yong; Yang, Baofeng.

Frigid Zone Medicine ; 3(1):1-4, 2023.

Article in English | Academic Search Complete | ID: covidwho-2224701

4.

Examples of Structural Motifs in Viral Genomes and Approaches for RNA Structure Characterization.

Nalewaj, Maria; Szabat, Marta.

Int J Mol Sci ; 23(24)2022 Dec 14.

Article in English | MEDLINE | ID: covidwho-2163440

ABSTRACT

The relationship between conserved structural motifs and their biological function in the virus replication cycle is of interest to many researchers around the world. RNA structure is closely related to RNA function. Therefore, technological progress in high-throughput approaches for RNA structure analysis and the development of new ones are very important. In this mini review, we discuss a few perspectives on the structural elements of viral genomes and some methods used for RNA structure prediction and characterization. Based on the recent literature, we describe several examples of studies concerning viral genomes, especially severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza A virus (IAV). Herein, we emphasize that a better understanding of viral genome architecture allows for the discovery of the structure-function relationship and, as a result, the discovery of new potential antiviral therapeutics.

Subject(s)

COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/genetics , Genome, Viral , RNA, Viral/genetics , RNA, Viral/chemistry , Antiviral Agents , Virus Replication/genetics

5.

Predicting the Assembly of the Transmembrane Domains of Viral Channel Forming Proteins and Peptide Drug Screening Using a Docking Approach.

Huang, Ta-Chou; Fischer, Wolfgang B.

Biomolecules ; 12(12)2022 12 10.

Article in English | MEDLINE | ID: covidwho-2154889

ABSTRACT

A de novo assembly algorithm is provided to propose the assembly of bitopic transmembrane domains (TMDs) of membrane proteins. The algorithm is probed using, in particular, viral channel forming proteins (VCPs) such as M2 of influenza A virus, E protein of severe acute respiratory syndrome corona virus (SARS-CoV), 6K of Chikungunya virus (CHIKV), SH of human respiratory syncytial virus (hRSV), and Vpu of human immunodeficiency virus type 2 (HIV-2). The generation of the structures is based on screening a 7-dimensional space. Assembly of the TMDs can be achieved either by simultaneously docking the individual TMDs or via a sequential docking. Scoring based on estimated binding energies (EBEs) of the oligomeric structures is obtained by the tilt to decipher the handedness of the bundles. The bundles match especially well for all-atom models of M2 referring to an experimentally reported tetrameric bundle. Docking of helical poly-peptides to experimental structures of M2 and E protein identifies improving EBEs for positively charged (K,R,H) and aromatic amino acids (F,Y,W). Data are improved when using polypeptides for which the coordinates of the amino acids are adapted to the Cα coordinates of the respective experimentally derived structures of the TMDs of the target proteins.

Subject(s)

Molecular Docking Simulation , Peptides , Viroporin Proteins , Humans , Drug Evaluation, Preclinical , Peptides/chemistry , Protein Structure, Tertiary , Viroporin Proteins/chemistry , Protein Domains

6.

Computational Pipeline for Reference-Free Comparative Analysis of RNA 3D Structures Applied to SARS-CoV-2 UTR Models.

Gumna, Julita; Antczak, Maciej; Adamiak, Ryszard W; Bujnicki, Janusz M; Chen, Shi-Jie; Ding, Feng; Ghosh, Pritha; Li, Jun; Mukherjee, Sunandan; Nithin, Chandran; Pachulska-Wieczorek, Katarzyna; Ponce-Salvatierra, Almudena; Popenda, Mariusz; Sarzynska, Joanna; Wirecki, Tomasz; Zhang, Dong; Zhang, Sicheng; Zok, Tomasz; Westhof, Eric; Miao, Zhichao; Szachniuk, Marta; Rybarczyk, Agnieszka.

Int J Mol Sci ; 23(17)2022 Aug 25.

Article in English | MEDLINE | ID: covidwho-2006037

ABSTRACT

RNA is a unique biomolecule that is involved in a variety of fundamental biological functions, all of which depend solely on its structure and dynamics. Since the experimental determination of crystal RNA structures is laborious, computational 3D structure prediction methods are experiencing an ongoing and thriving development. Such methods can lead to many models; thus, it is necessary to build comparisons and extract common structural motifs for further medical or biological studies. Here, we introduce a computational pipeline dedicated to reference-free high-throughput comparative analysis of 3D RNA structures. We show its application in the RNA-Puzzles challenge, in which five participating groups attempted to predict the three-dimensional structures of 5'- and 3'-untranslated regions (UTRs) of the SARS-CoV-2 genome. We report the results of this puzzle and discuss the structural motifs obtained from the analysis. All simulated models and tools incorporated into the pipeline are open to scientific and academic use.

Subject(s)

COVID-19 , RNA , 3' Untranslated Regions , Humans , Nucleic Acid Conformation , RNA/chemistry , SARS-CoV-2

7.

A Uniquely Stable Trimeric Model of SARS-CoV-2 Spike Transmembrane Domain.

Aliper, Elena T; Krylov, Nikolay A; Nolde, Dmitry E; Polyansky, Anton A; Efremov, Roman G.

Int J Mol Sci ; 23(16)2022 Aug 17.

Article in English | MEDLINE | ID: covidwho-1987839

ABSTRACT

Understanding fusion mechanisms employed by SARS-CoV-2 spike protein entails realistic transmembrane domain (TMD) models, while no reliable approaches towards predicting the 3D structure of transmembrane (TM) trimers exist. Here, we propose a comprehensive computational framework to model the spike TMD only based on its primary structure. We performed amino acid sequence pattern matching and compared the molecular hydrophobicity potential (MHP) distribution on the helix surface against TM homotrimers with known 3D structures and selected an appropriate template for homology modeling. We then iteratively built a model of spike TMD, adjusting "dynamic MHP portraits" and residue variability motifs. The stability of this model, with and without palmitoyl modifications downstream of the TMD, and several alternative configurations (including a recent NMR structure), was tested in all-atom molecular dynamics simulations in a POPC bilayer mimicking the viral envelope. Our model demonstrated unique stability under the conditions applied and conforms to known basic principles of TM helix packing. The original computational framework looks promising and could potentially be employed in the construction of 3D models of TM trimers for a wide range of membrane proteins.

Subject(s)

SARS-CoV-2 , Spike Glycoprotein, Coronavirus , Molecular Dynamics Simulation , Protein Domains , Spike Glycoprotein, Coronavirus/chemistry

8.

Methods and applications of machine learning in structure-based drug discovery

Sanjeevi, M.; Hebbar, P. N.; Aiswarya, N.; Rashmi, S.; Rahul, C. N.; Mohan, A.; Jeyakanthan, J.; Sekar, K.

Advances in Protein Molecular and Structural Biology Methods ; : 405-437, 2022.

Article in English | Scopus | ID: covidwho-1859219

ABSTRACT

Structure-based drug discovery (SBDD) utilizes the three-dimensional (3D) structure of a target protein to identify the lead compounds. This medium is then considered a viable solution based on its availability and correlation with a particular disease. In the case of pandemics like COVID-19, shortening drug development time can save millions of people worldwide; for such a task, classical drug discovery methods will take a long time. Hence, researchers worldwide actively incorporated machine learning (ML) into the drug discovery process, particularly in SBDD, to minimize the lead optimization time. ML uses statistical methods to make a computer perform tasks, make critical decisions, and automate this entire process without being explicitly programmed. With this, the computer can discover new insights about data and unknown patterns crucial to deciding the therapeutic use of lead compounds as drugs. The use of ML in the drug discovery field is not new, and it spans an ample research space. By integrating artificial intelligence with ML techniques, viable targets can be found using data clustering, regression, and classification from vast omics databases and sources. In this chapter, we will discuss the methods and applications of ML in SBDD. © 2022 Elsevier Inc. All rights reserved.

9.

Development of a multi-epitope spike glycoprotein vaccine to combat SARS-CoV-2 using the bioinformatics approach

Shehzad, A.; Sumartono, C.; Nugraha, J.; Susilowati, H.; Wijaya, A. Y.; Ahmad, H. I.; Kashif, M.; Tyasningsih, W.; Rantam, F. A.

Journal of Pharmacy & Pharmacognosy Research ; 10(3):445-458, 2022.

Article in English | Web of Science | ID: covidwho-1749746

ABSTRACT

Context: The current COVID-19 pandemic has significantly impacted health and socio-economic status worldwide. The only way to combat this situation is to develop an effective vaccine and immunize people around the globe. Aims: To construct a multi-epitope spike glycoprotein-based vaccine from the SARS-CoV-2 Surabaya isolate using a bioinformatics approach. Methods: The spike protein was submitted to IEDB, VaxiJen, AllerTOP, and ToxinPred webservers to predict antigenic, non-allergic, non-toxic, B- and T-cell epitopes. To develop a multi-epitope vaccine, an adjuvant cholera toxin B subunit was linked to the B-cell epitopes, and the B-cell epitopes to the T-cell epitopes, through EAAAK and GPGPG linkers, respectively. The 3D structure of the designed vaccine was developed, refined, and validated through the PHYRE2, Galaxy Refine, and RAMPAGE webservers. Moreover, the Cluspro-2.0 webserver was used for the molecular docking of the designed vaccine with TLR3. The vaccine+TLR3 complex was docked with Surfactant protein A as a control to validate the docking results. Finally, immune-simulation and in silico cloning of the vaccine were carried out by C-ImmSim webserver and SnapGene software, respectively. Results: A multi-epitope vaccine containing B- and T-cell epitopes was developed, comprising 392 amino acids with a molecular weight of 40825.59 Da. The docking and immunogenicity results of the vaccine met all established parameters for constructing a quality vaccine. Furthermore, the optimized sequence of the vaccine was successfully cloned into the expression vector pET-28a(+), yielding a clone of 2724 bp. Conclusions: The vaccine's immunogenicity demonstrates its effectiveness against SARS-CoV-2 infection. Further confirmatory testing may therefore be performed as soon as possible in the public interest.
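The linker scheme described in the Methods of this abstract can be illustrated with a minimal Python sketch. Only the two linker sequences (EAAAK and GPGPG) come from the abstract; the adjuvant and epitope strings are short placeholders rather than sequences from the study, and joining epitopes within one block with GPGPG is an assumption the abstract does not state.

```python
# Illustrative sketch only. The adjuvant and epitope strings are placeholders,
# NOT sequences from the study; only the linker sequences come from the abstract.
EAAAK = "EAAAK"  # linker between the adjuvant and the B-cell epitope block
GPGPG = "GPGPG"  # linker between the B-cell and T-cell epitope blocks


def assemble_construct(adjuvant, b_cell_epitopes, t_cell_epitopes):
    """Concatenate adjuvant, B-cell block, and T-cell block with linkers.

    Assumption: epitopes inside one block are also joined with GPGPG.
    """
    b_block = GPGPG.join(b_cell_epitopes)
    t_block = GPGPG.join(t_cell_epitopes)
    return adjuvant + EAAAK + b_block + GPGPG + t_block


# Hypothetical four-residue placeholders, chosen only to keep the output readable.
construct = assemble_construct("MTPQ", ["SYGF", "NVLK"], ["WRDE"])
print(construct)       # MTPQEAAAKSYGFGPGPGNVLKGPGPGWRDE
print(len(construct))  # 31
```

With real sequences, the same function would return the full-length construct that is then passed on to structure prediction and docking.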

10.

Computational Design of Potential Binder Protein for SARS-CoV-2 Spike RBD through A Novel Deep Neural Network Based-Protein Outpainting Algorithm

Duan, B.; Sun, Y.

5th International Conference on Biological Information and Biomedical Engineering, BIBE 2021 ; 2021.

Article in English | Scopus | ID: covidwho-1566381

ABSTRACT

COVID-19 caused by SARS-CoV-2 is seriously endangering the health of all human beings. There is an urgent need for drugs that can inhibit the replication and propagation of the virus. Traditional macromolecular drugs have long discovery and development cycles and high experimental costs, and so cannot respond rapidly to new viruses. Through computational protein design methods, scientists have designed binder proteins with high affinity for the RBD of the SARS-CoV-2 spike protein that can effectively inhibit virus replication. However, traditional computational protein design methods rely heavily on human experience and domain knowledge of protein design, and the protein design workflow is too complicated to be widely accepted and used in academia and industry. Based on previous work in the field of deep neural network protein structure prediction and protein design, we developed a novel protein outpainting method that can generate the remaining part of the protein based on a given hot spot motif and complete the entire protein. This method can generate a stable protein scaffold that supports the functional hot spot motif, resulting in a protein with excellent thermal stability and developability. We tested this method in a drug discovery project with the aim of designing new SARS-CoV-2 inhibitors. Several proteins were obtained that are predicted to be stable and may have high affinity for the RBD of the SARS-CoV-2 spike protein. Although they have not been verified by wet-lab experiments, we believe that these proteins have great potential to be developed into effective drugs for the treatment of COVID-19. The protein outpainting algorithm proposed in this paper has great advantages over traditional protein design methods. It can be applied to many fields that require the design of functional proteins, such as protein drug design, enzyme de novo design, vaccine design, etc. The method will play an important role in reducing the cost of experiments, shortening the research and development period, and improving the success rate of biological research and development. © 2021 ACM.

11.

Conformational variability of loops in the SARS-CoV-2 spike protein.

Wong, Samuel W K; Liu, Zongjun.

Proteins ; 90(3): 691-703, 2022 03.

Article in English | MEDLINE | ID: covidwho-1469554

ABSTRACT

The SARS-CoV-2 spike (S) protein facilitates viral infection, and has been the focus of many structure determination efforts. Its flexible loop regions are known to be involved in protein binding and may adopt multiple conformations. This article identifies the S protein loops and studies their conformational variability based on the available Protein Data Bank structures. While most loops had essentially one stable conformation, 17 of 44 loop regions were observed to be structurally variable with multiple substantively distinct conformations based on a cluster analysis. Loop modeling methods were then applied to the S protein loop targets, and the prediction accuracies discussed in relation to the characteristics of the conformational clusters identified. Loops with multiple conformations were found to be challenging to model based on a single structural template.

Subject(s)

COVID-19/virology , SARS-CoV-2/chemistry , Spike Glycoprotein, Coronavirus/chemistry , Cluster Analysis , Humans , Models, Molecular , Protein Conformation

12.

Modeling SARS-CoV-2 proteins in the CASP-commons experiment.

Kryshtafovych, Andriy; Moult, John; Billings, Wendy M; Della Corte, Dennis; Fidelis, Krzysztof; Kwon, Sohee; Olechnovic, Kliment; Seok, Chaok; Venclovas, Ceslovas; Won, Jonghun.

Proteins ; 89(12): 1987-1996, 2021 12.

Article in English | MEDLINE | ID: covidwho-1449944

ABSTRACT

Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).

Subject(s)

SARS-CoV-2/chemistry , Viral Proteins/chemistry , COVID-19/virology , Genome, Viral , Humans , Models, Molecular , Protein Conformation , Protein Domains , SARS-CoV-2/genetics , Viral Proteins/genetics , Viroporin Proteins/chemistry , Viroporin Proteins/genetics
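The GDT_TS scores quoted in the abstract above (63 to 87) can be made concrete with a small Python sketch. GDT_TS is conventionally the mean, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of residues whose Cα atoms lie within that cutoff of the experimental structure. The sketch below is a simplification: it assumes the per-residue distances come from a single fixed superposition, whereas the full measure searches for the best superposition at each cutoff. The input distances are invented for illustration.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS on a 0-100 scale.

    distances: per-residue C-alpha deviations (angstroms) between model and
    experimental structure, assumed to come from ONE fixed superposition.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    # Fraction of residues within each cutoff, then averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical five-residue example (values invented for illustration).
example = [0.5, 1.5, 3.0, 6.0, 10.0]
score = gdt_ts(example)
print(round(score, 1))
```

Because only one superposition is used here, the result can underestimate the value a full GDT_TS calculation would report for the same pair of structures.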

13.

RPocket: an intuitive database of RNA pocket topology information with RNA-ligand data resources.

Zhou, Ting; Wang, Huiwen; Zeng, Chen; Zhao, Yunjie.

BMC Bioinformatics ; 22(1): 428, 2021 Sep 08.

Article in English | MEDLINE | ID: covidwho-1405916

ABSTRACT

BACKGROUND: RNA regulates a variety of biological functions by interacting with other molecules. The ligand often binds in the RNA pocket to trigger structural changes or functions. Thus, it is essential to explore and visualize the RNA pocket to elucidate the structural and recognition mechanism for the RNA-ligand complex formation. RESULTS: In this work, we developed one user-friendly bioinformatics tool, RPocket. This database provides geometrical size, centroid, shape, secondary structure element for RNA pocket, RNA-ligand interaction information, and functional sites. We extracted 240 RNA pockets from 94 non-redundant RNA-ligand complex structures. We developed RPDescriptor to calculate the pocket geometrical property quantitatively. The geometrical information was then subjected to RNA-ligand binding analysis by incorporating the sequence, secondary structure, and geometrical combinations. This new approach takes advantage of both the atom-level precision of the structure and the nucleotide-level tertiary interactions. The results show that the higher-level topological pattern indeed improves the tertiary structure prediction. We also proposed a potential mechanism for RNA-ligand complex formation. The electrostatic interactions are responsible for long-range recognition, while the Van der Waals and hydrophobic contacts for short-range binding and optimization. These interaction pairs can be considered as distance constraints to guide complex structural modeling and drug design. CONCLUSION: RPocket database would facilitate RNA-ligand engineering to regulate the complex formation for biological or medical applications. RPocket is available at http://zhaoserver.com.cn/RPocket/RPocket.html .

Subject(s)

Computational Biology , RNA , Binding Sites , Ligands , Protein Structure, Secondary , RNA/genetics

14.

Probing the Increased Virulence of Severe Acute Respiratory Syndrome Coronavirus 2 B.1.617 (Indian Variant) From Predicted Spike Protein Structure.

Hajj-Hassan, Houssein; Hamze, Kassem; Abdel Sater, Fadi; Kizilbash, Nadeem; Khachfe, Hassan M.

Cureus ; 13(8): e16905, 2021 Aug.

Article in English | MEDLINE | ID: covidwho-1374645

ABSTRACT

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has led to a worldwide pandemic. The spike (S) protein of SARS-CoV-2, which plays a key role in the receptor recognition and cell membrane fusion process, is composed of two subunits, S1 and S2. The S1 subunit contains a receptor-binding domain that recognizes and binds to the host receptor angiotensin-converting enzyme 2 (ACE2), while the S2 subunit mediates fusion of the viral membrane with the cell membrane and subsequent entry into cells. Mutations in the spike protein (S) are of particular interest due to their potential for reducing susceptibility to neutralizing antibodies or increasing viral transmissibility and infectivity. Recently, mutations in the spike protein have given rise to new variants, including Delta and Kappa (known as the Indian variants). The Delta and Kappa variants, both derived from the B.1.617 lineage first identified in India in October 2020, are now of most concern because of their increased infectivity. This study employed homology modeling to probe the potential structural effects of the mutations. It was found that the mutations Leu452Arg, Thr478Lys, and Glu484Gln in the spike protein increase the affinity for the hACE2 receptor, which explains the greater infectivity of the SARS-CoV-2 B.1.617 (Indian Variant).
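The three spike mutations named in this abstract are written in three-letter notation, while much of the variant literature uses the one-letter form. The following Python sketch converts between the two, using the standard amino acid codes; the function name `to_one_letter` is introduced here purely for illustration.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(mutation):
    """Convert e.g. 'Leu452Arg' to 'L452R' (hypothetical helper)."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)([A-Za-z]{3})", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    ref, position, alt = match.groups()
    # capitalize() normalizes case so 'LEU' and 'leu' are also accepted.
    return THREE_TO_ONE[ref.capitalize()] + position + THREE_TO_ONE[alt.capitalize()]


short_forms = [to_one_letter(m) for m in ["Leu452Arg", "Thr478Lys", "Glu484Gln"]]
print(short_forms)  # ['L452R', 'T478K', 'E484Q']
```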

15.

Biomolecular Modeling and Simulation: A Prospering Multidisciplinary Field.

Schlick, Tamar; Portillo-Ledesma, Stephanie; Myers, Christopher G; Beljak, Lauren; Chen, Justin; Dakhel, Sami; Darling, Daniel; Ghosh, Sayak; Hall, Joseph; Jan, Mikaeel; Liang, Emily; Saju, Sera; Vohr, Mackenzie; Wu, Chris; Xu, Yifan; Xue, Eva.

Annu Rev Biophys ; 50: 267-301, 2021 05 06.

Article in English | MEDLINE | ID: covidwho-1348196

ABSTRACT

We reassess progress in the field of biomolecular modeling and simulation, following up on our perspective published in 2011. By reviewing metrics for the field's productivity and providing examples of success, we underscore the productive phase of the field, whose short-term expectations were overestimated and long-term effects underestimated. Such successes include prediction of structures and mechanisms; generation of new insights into biomolecular activity; and thriving collaborations between modeling and experimentation, including experiments driven by modeling. We also discuss the impact of field exercises and web games on the field's progress. Overall, we note tremendous success by the biomolecular modeling community in utilization of computer power; improvement in force fields; and development and application of new algorithms, notably machine learning and artificial intelligence. The combined advances are enhancing the accuracy and scope of modeling and simulation, establishing an exemplary discipline where experiment and theory or simulations are full partners.

Subject(s)

Computer Simulation , Algorithms

16.

The impact of structural bioinformatics tools and resources on SARS-CoV-2 research and therapeutic strategies.

Waman, Vaishali P; Sen, Neeladri; Varadi, Mihaly; Daina, Antoine; Wodak, Shoshana J; Zoete, Vincent; Velankar, Sameer; Orengo, Christine.

Brief Bioinform ; 22(2): 742-768, 2021 03 22.

Article in English | MEDLINE | ID: covidwho-1343644

ABSTRACT

SARS-CoV-2 is the causative agent of COVID-19, the ongoing global pandemic. It has posed a worldwide challenge to human health as no effective treatment is currently available to combat the disease. Its severity has led to unprecedented collaborative initiatives for therapeutic solutions against COVID-19. Studies resorting to structure-based drug design for COVID-19 are plethoric and show good promise. Structural biology provides key insights into 3D structures, critical residues/mutations in SARS-CoV-2 proteins, implicated in infectivity, molecular recognition and susceptibility to a broad range of host species. The detailed understanding of viral proteins and their complexes with host receptors and candidate epitope/lead compounds is the key to developing a structure-guided therapeutic design. Since the discovery of SARS-CoV-2, several structures of its proteins have been determined experimentally at an unprecedented speed and deposited in the Protein Data Bank. Further, specialized structural bioinformatics tools and resources have been developed for theoretical models, data on protein dynamics from computer simulations, impact of variants/mutations and molecular therapeutics. Here, we provide an overview of ongoing efforts on developing structural bioinformatics tools and resources for COVID-19 research. We also discuss the impact of these resources and structure-based studies, to understand various aspects of SARS-CoV-2 infection and therapeutic development. These include (i) understanding differences between SARS-CoV-2 and SARS-CoV, leading to increased infectivity of SARS-CoV-2, (ii) deciphering key residues in the SARS-CoV-2 involved in receptor-antibody recognition, (iii) analysis of variants in host proteins that affect host susceptibility to infection and (iv) analyses facilitating structure-based drug and vaccine design against SARS-CoV-2.

Subject(s)

Antiviral Agents/therapeutic use , COVID-19 Drug Treatment , Computational Biology , SARS-CoV-2/isolation & purification , COVID-19/virology , Humans , Protein Conformation , Viral Proteins/chemistry

17.

Atypical Divergence of SARS-CoV-2 Orf8 from Orf7a within the Coronavirus Lineage Suggests Potential Stealthy Viral Strategies in Immune Evasion.

Neches, Russell Y; Kyrpides, Nikos C; Ouzounis, Christos A.

mBio ; 12(1)2021 01 19.

Article in English | MEDLINE | ID: covidwho-1066821

ABSTRACT

Orf8, one of the most puzzling genes in the SARS lineage of coronaviruses, marks a unique and striking difference in genome organization between SARS-CoV-2 and SARS-CoV-1. Here, using sequence comparisons, we unequivocally reveal the distant sequence similarities between SARS-CoV-2 Orf8 with its SARS-CoV-1 counterparts and the X4-like genes of coronaviruses, including its highly divergent "paralog" gene Orf7a, whose product is a potential immune antagonist of known structure. Supervised sequence space walks unravel identity levels that drop below 10% and yet exhibit subtle conservation patterns in this novel superfamily, characterized by an immunoglobulin-like beta sandwich topology. We document the high accuracy of the sequence space walk process in detail and characterize the subgroups of the superfamily in sequence space by systematic annotation of gene and taxon groups. While SARS-CoV-1 Orf7a and Orf8 genes are most similar to bat virus sequences, their SARS-CoV-2 counterparts are closer to pangolin virus homologs, reflecting the fine structure of conservation patterns within the SARS-CoV-2 genomes. The divergence between Orf7a and Orf8 is exceptionally idiosyncratic, since Orf7a is more constrained, whereas Orf8 is subject to rampant change, a peculiar feature that may be related to hitherto-unknown viral infection strategies. Despite their common origin, the Orf7a and Orf8 protein families exhibit different modes of evolutionary trajectories within the coronavirus lineage, which might be partly attributable to their complex interactions with the mammalian host cell, reflected by a multitude of functional associations of Orf8 in SARS-CoV-2 compared to a very small number of interactions discovered for Orf7a. IMPORTANCE: Orf8 is one of the most puzzling genes in the SARS lineage of coronaviruses, including SARS-CoV-2. Using sophisticated sequence comparisons, we confirm its origins from Orf7a, another gene in the lineage that appears as more conserved, compared to Orf8. Orf7a is a potential immune antagonist of known structure, while a deletion of Orf8 was shown to decrease the severity of the infection in a cohort study. The subtle sequence similarities imply that Orf8 has the same immunoglobulin-like fold as Orf7a, confirmed by structure determination. We characterize the subgroups of this superfamily and demonstrate the highly idiosyncratic divergence patterns during the evolution of the virus.

Subject(s)

COVID-19/immunology , Immune Evasion , SARS-CoV-2/genetics , SARS-CoV-2/immunology , Viral Proteins/immunology , Animals , COVID-19/virology , Databases, Genetic , Evolution, Molecular , Genome, Viral , Humans , Phylogeny , Sequence Alignment , Viral Proteins/genetics

18.

A recent origin of Orf3a from M protein across the coronavirus lineage arising by sharp divergence.

Ouzounis, Christos A.

Comput Struct Biotechnol J ; 18: 4093-4102, 2020.

Article in English | MEDLINE | ID: covidwho-957002

ABSTRACT

The genome of SARS-CoV-2, the coronavirus responsible for the Covid-19 pandemic, encodes a number of accessory genes. The longest accessory gene, Orf3a, plays important roles in the virus lifecycle indicated by experimental findings, known polymorphisms, its evolutionary trajectory and a distinct three-dimensional fold. Here we show that supervised, sensitive database searches with Orf3a detect weak, yet significant and highly specific similarities to the M proteins of coronaviruses. The similarity profiles can be used to derive low-resolution three-dimensional models for M proteins based on Orf3a as a structural template. The models also explain the emergence of Orf3a from M proteins and suggest a recent origin across the coronavirus lineage, enunciated by its restricted phylogenetic distribution. This study provides evidence for the common origin of M and Orf3a families and proposes for the first time a working model for the structure of the universally distributed M proteins in coronaviruses, consistent with the properties of both protein families.
